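Priming and antipriming can be modeled with error-driven learning (Marsolek, 2008), by assuming that learning of the prime influences the processing of the target stimulus. This implies that participants learn continuously in priming studies, and it predicts that they also learn on each trial of other psycholinguistic experiments. This study investigates whether trial-to-trial learning can be detected in lexical decision experiments. We used the Discriminative Lexicon Model (DLM; Baayen et al., 2019), a model of the mental lexicon with meaning representations from distributional semantics, which models incremental learning with the Widrow-Hoff rule. We used data from the British Lexicon Project (BLP; Keuleers et al., 2012) and simulated the lexical decision experiment with the DLM on a trial-by-trial basis for each subject individually. Reaction times for words and nonwords were then predicted using measures derived from the DLM simulations as predictors. Models were developed on the data of two subjects and tested on all other subjects. For each subject, we extracted measures from two simulations (one with learning updates between trials and one without) and used them as input to two generalized additive models (GAMs). For the majority of subjects, the learning-based models showed a better model fit than the non-learning models. Our measures also provided insights into lexical processing and enabled us to explore individual differences with linear mixed models. This demonstrates the potential of the DLM for modeling behavioral data and leads to the conclusion that trial-to-trial learning can indeed be detected in psycholinguistic experiments.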
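For readers unfamiliar with the Widrow-Hoff rule named in the abstract above, the following is a minimal sketch of one incremental (trial-by-trial) update of a linear mapping. It is not the DLM implementation: the function name, the toy cue and target vectors, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def widrow_hoff_update(W, cue, target, learning_rate=0.01):
    """One Widrow-Hoff (delta rule) step for a linear mapping W.

    W has shape (n_cues, n_dims); cue has shape (n_cues,);
    target has shape (n_dims,). All names here are hypothetical.
    """
    prediction = cue @ W            # current prediction for this trial
    error = target - prediction     # prediction error drives the update
    return W + learning_rate * np.outer(cue, error)

def total_error(W, cues, targets):
    """Sum of squared prediction errors over all trials."""
    return float(np.sum((targets - cues @ W) ** 2))

# Toy data (assumption): binary form cues and targets generated by a
# hidden linear map, so that the mapping is learnable.
rng = np.random.default_rng(0)
n_cues, n_dims = 6, 4
cues = rng.integers(0, 2, size=(20, n_cues)).astype(float)
targets = cues @ rng.normal(size=(n_cues, n_dims))

W = np.zeros((n_cues, n_dims))
error_before = total_error(W, cues, targets)
for _ in range(200):                          # repeated passes over the trials
    for cue, target in zip(cues, targets):    # one update per trial
        W = widrow_hoff_update(W, cue, target, learning_rate=0.05)
error_after = total_error(W, cues, targets)
```

Because the mapping changes after every trial, the model's state at trial t depends on the order of the preceding trials, which is the property a trial-by-trial simulation relies on.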
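Current computational models that capture the meaning of words rely mostly on textual corpora. Although these approaches have been successful over the past decades, their lack of grounding in the real world remains an ongoing problem. In this paper, we focus on the visual grounding of word embeddings and target two important questions. First, how can language benefit from vision in the process of visual grounding? Second, is there a link between visual grounding and abstract concepts? We investigate these questions by proposing a simple yet effective approach in which language benefits from vision specifically with respect to the modeling of both concrete and abstract words. Our model aligns word embeddings with their corresponding visual representations without degrading the knowledge captured by textual distributional information. We apply our model to a behavioral experiment reported by Günther et al. (2020), which addresses the plausibility of visual mental representations for abstract words. Our evaluation results show that: (1) human behavior can be predicted to a large degree; (2) our grounded embeddings model human behavior better than their textual counterparts; (3) abstract concepts benefit from visual grounding implicitly, through their connections to concrete concepts, rather than by having corresponding visual representations.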
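Language grounding to vision is an active field of research that aims to enrich text-based representations of word meaning by leveraging perceptual knowledge from vision. Despite many grounding attempts, it is still unclear how to inject visual knowledge into the word embeddings of a language in a way that maintains a proper balance between textual and visual knowledge. Some common concerns are the following. Is visual grounding beneficial for abstract words, or is its contribution limited to concrete words? What is the optimal way of bridging the gap between text and vision? How much do we gain by visually grounding textual embeddings? The present study addresses these questions by proposing a simple yet very effective grounding approach for pre-trained word embeddings. Our model aligns textual embeddings with vision while largely preserving the distributional statistics that characterize word use in text corpora. By applying the learned alignment, we are able to generate visually grounded embeddings for unseen words, including abstract words. A series of evaluations on word similarity benchmarks shows that visual grounding is beneficial not only for concrete words but also for abstract words. We also show that our visual grounding method offers advantages for contextualized embeddings, but only when these are trained on corpora of relatively modest size. Code and grounded embeddings for English are available at https://github.com/hazel1994/visaly_grounded_word_word_embeddings_2.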
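This study addresses a series of methodological questions that arise when modeling inflectional morphology with Linear Discriminative Learning. Taking the semi-productive German noun system as an example, we illustrate how decisions about the representation of form and meaning influence model performance. We clarify that, in order to model frequency effects in learning, it is essential to use incremental learning rather than the end state of learning. We also discuss how the model can be set up to approximate the learning of inflected words in context. In addition, we illustrate how, in this approach, the wug task can be modeled in considerable detail. In general, the model provides an excellent memory for known words but, appropriately, shows more limited performance on unseen data, in line with the semi-productivity of German noun inflection and the generalization performance of native German speakers.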
Many challenging reinforcement learning (RL) problems require designing a distribution of tasks that can be used to train effective policies. This distribution of tasks can be specified by a curriculum, which is meant to improve the results of learning and to accelerate it. We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning in which a task sequence is created based on the success rate of each task. In this setting, each task is an algorithmically created environment instance with a unique configuration. The algorithm selects the order of tasks that provides the fastest learning for agents. The probability of selecting a given task for the next stage of learning is determined by evaluating its performance score in previous stages. Experiments were carried out in the Partially Observable Grid Environment for Multiple Agents (POGEMA) and on the Procgen benchmark. We demonstrate that SITP matches or surpasses the results of other curriculum design methods. Our method can be implemented with a handful of minor modifications to any standard RL framework and provides useful prioritization with minimal computational overhead.
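The selection step described above (task probabilities derived from a per-task score based on success rate) can be sketched as follows. This is an illustration under stated assumptions, not the SITP algorithm itself: the abstract does not give the scoring formula, so the score used here (absolute change in a smoothed success rate), the softmax sampling, and the class and parameter names are all hypothetical.

```python
import numpy as np

class SuccessRatePrioritizer:
    """Toy task sampler driven by per-task success rates (hypothetical design)."""

    def __init__(self, n_tasks, temperature=0.1, smoothing=0.5):
        self.success_rate = np.zeros(n_tasks)  # smoothed success rate per task
        self.score = np.zeros(n_tasks)         # assumed score: recent change in rate
        self.temperature = temperature
        self.smoothing = smoothing

    def update(self, task, success):
        """Record the outcome of one episode on `task`."""
        old = self.success_rate[task]
        new = (1 - self.smoothing) * old + self.smoothing * float(success)
        self.score[task] = abs(new - old)      # tasks whose rate is moving score higher
        self.success_rate[task] = new

    def probabilities(self):
        """Softmax over scores; uniform when all scores are equal."""
        logits = self.score / self.temperature
        logits = logits - logits.max()         # subtract max for numerical stability
        weights = np.exp(logits)
        return weights / weights.sum()

    def sample(self, rng):
        """Draw the index of the next task to train on."""
        return int(rng.choice(len(self.score), p=self.probabilities()))

# Usage: record one successful episode on task 1, then sample the next task.
prioritizer = SuccessRatePrioritizer(n_tasks=3)
prioritizer.update(task=1, success=True)
next_task = prioritizer.sample(np.random.default_rng(0))
```

A sampler of this shape only needs the episode outcome from the training loop, which is consistent with the abstract's claim that the method requires few modifications to a standard RL framework.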
This paper presents a solution to the GenChal 2022 shared task on feedback comment generation for writing learning. In this task, given a text with an error and the span of the error, a system generates an explanatory note that helps the writer (a language learner) improve their writing skills. Our solution is based on fine-tuning the T5 model on the initial dataset, augmented according to the syntactic dependencies of the words located within the indicated error span. The solution of our team "nigula" obtained second place according to the manual evaluation by the organizers.
The task of reconstructing 3D human motion has wide-ranging applications. Gold-standard motion capture (MoCap) systems are accurate but inaccessible to the general public due to their cost, hardware, and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap, as they take single-view videos as inputs. Replacing multi-view MoCap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion, thus making exciting applications like motion analysis and motion-driven animation accessible to the general public. However, the performance of existing HMR methods degrades when the video contains challenging and dynamic motion that is not present in the existing MoCap datasets used for training. This reduces their appeal, as dynamic motion is frequently the target of 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field, which is optimized to represent the underlying 3D motions across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action, and show that NeMo achieves better 3D reconstruction than various baselines.
Model calibration, which is concerned with how frequently the model predicts correctly, not only plays a vital part in statistical model design but also has substantial practical applications, such as optimal decision-making in the real world. However, it has been discovered that modern deep neural networks (DNNs) are generally poorly calibrated due to the overestimation (or underestimation) of predictive confidence, which is closely related to overfitting. In this paper, we propose Annealing Double-Head, a simple-to-implement but highly effective architecture for calibrating the DNN during training. To be precise, we construct an additional calibration head (a shallow neural network that typically has one latent layer) on top of the last latent layer of the normal model to map the logits to the aligned confidence. Furthermore, we develop a simple annealing technique that dynamically scales the logits using the calibration head during training to improve its performance. Under both in-distribution and distributional-shift conditions, we exhaustively evaluate our Annealing Double-Head architecture on multiple pairs of contemporary DNN architectures and vision and speech datasets. We demonstrate that our method achieves state-of-the-art model calibration performance without post-processing, while simultaneously providing predictive accuracy comparable to other recently proposed calibration methods on a range of learning tasks.
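To make the notion of calibration above concrete, here is a minimal sketch of expected calibration error (ECE), a widely used metric that compares predicted confidence with observed accuracy within confidence bins. The abstract does not state which metric the authors report, so treating ECE as the measure is an assumption, and the function name and binning choices are illustrative.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins over (0, 1].

    confidences: predicted confidence of the chosen class, one per sample.
    correct: 1 if the prediction was right, 0 otherwise.
    Samples with confidence exactly 0 fall outside every bin (a simplification).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            accuracy = correct[in_bin].mean()
            mean_confidence = confidences[in_bin].mean()
            # Weight each bin's gap by the fraction of samples it holds.
            ece += in_bin.mean() * abs(accuracy - mean_confidence)
    return float(ece)

# An overconfident toy model: always 100% confident, right half of the time.
overconfident_ece = expected_calibration_error([1.0] * 4, [1, 0, 1, 0])
```

A well-calibrated model has an ECE near zero, since its confidence in each bin matches its accuracy there.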
Dense prediction tasks such as segmentation and detection of pathological entities hold crucial clinical value in the digital pathology workflow. However, obtaining dense annotations on large cohorts is usually tedious and expensive. Contrastive learning (CL) is thus often employed to leverage large volumes of unlabeled data to pre-train the backbone network. To boost CL for dense prediction, some studies have proposed variations of dense matching objectives in pre-training. However, our analysis shows that employing existing dense matching strategies on histopathology images enforces invariance among incorrect pairs of dense features and is thus imprecise. To address this, we propose a precise location-based matching mechanism that utilizes the overlapping information between geometric transformations to precisely match regions in two augmentations. Extensive experiments on two pre-training datasets (TCGA-BRCA, NCT-CRC-HE) and three downstream datasets (GlaS, CRAG, BCSS) highlight the superiority of our method in semantic and instance segmentation tasks. Our method outperforms previous dense matching methods by up to 7.2% in average precision for detection and 5.6% in average precision for instance segmentation tasks. Additionally, by using our matching mechanism in three popular contrastive learning frameworks (MoCo-v2, VICRegL, and ConCL), the average precision in detection is improved by 0.7% to 5.2% and the average precision in segmentation is improved by 0.7% to 4.0%, demonstrating its generalizability.
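The idea of matching regions through the overlap of two geometric transformations can be illustrated with the simplest case, two axis-aligned crops of the same image. This sketch is an assumption-laden simplification, not the authors' mechanism: it ignores flips, resizing, and feature-map strides, and the function name and box format `(x0, y0, x1, y1)` are hypothetical.

```python
def overlap_in_view_coords(box_a, box_b):
    """Locate the shared region of two crops in each crop's own coordinates.

    box_a, box_b: (x0, y0, x1, y1) crop boxes in original-image pixels.
    Returns (region_in_a, region_in_b), or None if the crops do not overlap.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Intersection of the two crops in original-image coordinates.
    x0, y0 = max(ax0, bx0), max(ay0, by0)
    x1, y1 = min(ax1, bx1), min(ay1, by1)
    if x0 >= x1 or y0 >= y1:
        return None

    def to_local(origin_x, origin_y):
        # Shift the intersection so the crop's top-left corner is (0, 0).
        return (x0 - origin_x, y0 - origin_y, x1 - origin_x, y1 - origin_y)

    return to_local(ax0, ay0), to_local(bx0, by0)

# Two overlapping 100x100 crops of the same image.
regions = overlap_in_view_coords((0, 0, 100, 100), (50, 20, 150, 120))
```

Features pooled from the two returned regions describe the same image content, so they form a correct positive pair, whereas features taken from outside the overlap have no true counterpart in the other view.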
Modal verbs, such as "can", "may", and "must", are commonly used in daily communication to convey the speaker's perspective on the likelihood and/or mode of the proposition. They can differ greatly in meaning depending on how they are used and on the context of the sentence (e.g., "They 'must' help each other out." vs. "They 'must' have helped each other out."). Despite their practical importance in natural language understanding, linguists have yet to agree on a single, prominent framework for the categorization of modal verb senses. This lack of agreement stems from the high degree of flexibility and polysemy of modal verbs, which makes it more difficult for researchers to incorporate insights from this family of words into their work. This work presents the Moverb dataset, which consists of 27,240 annotations of modal verb senses over 4,540 utterances, each containing one or more sentences from social conversations. Each utterance is annotated by three annotators using two different theoretical frameworks of modal verb senses (i.e., Quirk and Palmer). We observe that both frameworks have similar inter-annotator agreement, despite having different numbers of sense types (8 for Quirk and 3 for Palmer). With RoBERTa-based classifiers fine-tuned on Moverb, we achieve F1 scores of 82.2 and 78.3 on Quirk and Palmer, respectively, showing that modal verb sense disambiguation is not a trivial task. Our dataset will be made publicly available with our final version.